1. Introduction
The Bilingual Road scene Video Text Dataset (BiRViT1K) was constructed by the National Laboratory of Pattern Recognition (NLPR), Institute of Automation of Chinese Academy of Sciences (CASIA). It contains 1,000 videos: 300 Chinese videos, 300 English videos and 400 bilingual videos. A total of 64,001 frames are annotated with 806,011 text instances at the line level, and every text instance is labeled with a quadrilateral, a transcript and a tracking identification (ID). We randomly select 70% of the videos of each type as the training set and use the rest as the test set, so the training set contains 44,808 frames from 700 videos and the test set contains 19,193 frames from 300 videos. Fig. 1 shows sample images from different scenes in the dataset.
Fig. 1 Some images of different scenes in BiRViT1K.
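The per-type 70/30 split described above can be illustrated with a short Python sketch. It is not the script used to build the released split: the video IDs, the seed and the function name `split_by_type` are assumptions made only for illustration.

```python
import random


def split_by_type(videos_by_type, train_ratio=0.7, seed=0):
    """Randomly split the video IDs of each type into train and test lists."""
    rng = random.Random(seed)
    train, test = [], []
    for _video_type, video_ids in sorted(videos_by_type.items()):
        ids = sorted(video_ids)
        rng.shuffle(ids)
        # Take the same fraction from every type so the split stays balanced.
        n_train = round(len(ids) * train_ratio)
        train.extend(ids[:n_train])
        test.extend(ids[n_train:])
    return train, test


# Hypothetical video IDs; the real file names in the dataset may differ.
videos_by_type = {
    "chinese": [f"cn_{i:04d}" for i in range(300)],
    "english": [f"en_{i:04d}" for i in range(300)],
    "bilingual": [f"bi_{i:04d}" for i in range(400)],
}
train_videos, test_videos = split_by_type(videos_by_type)
print(len(train_videos), len(test_videos))  # 700 300
```

Splitting within each type (210 + 210 + 280 training videos) keeps the proportion of Chinese, English and bilingual videos the same in both sets.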
2. Annotations
We annotate the text instances in the videos, covering Chinese characters, English letters, Arabic numerals and common symbols (e.g. commas, periods and spaces), using a quadrilateral annotation format. The label of each text instance includes the coordinates of the four corners of the text box, the transcript and the tracking identification (ID). If a text instance is hard to recognize or most of its area is truncated, we record its transcript as "###". Fig. 2 shows some annotated video frames. As shown in Fig. 1 and Fig. 2, the text instances in our dataset are small in scale and diverse in form (license plates, shop names, traffic signs, etc.), which makes the dataset more challenging.
Fig. 2 The annotations of video frames.
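One possible in-memory representation of such a label is sketched below in Python. The class name `TextInstance`, its field names and the helper `keep_readable` are hypothetical. The order of the four corners and the type of the tracking ID are not specified in this description, so the ID is kept as a string here.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Transcript used in the dataset for unreadable or heavily truncated text.
IGNORE_TRANSCRIPT = "###"


@dataclass
class TextInstance:
    quad: List[Tuple[float, float]]  # four (x, y) corners of the text box
    transcript: str                  # text content, or "###"
    track_id: str                    # tracking ID (type assumed to be a string)

    @property
    def is_ignored(self) -> bool:
        """True if the instance is marked as unreadable or truncated."""
        return self.transcript == IGNORE_TRANSCRIPT


def keep_readable(instances):
    """Drop instances whose transcript is the "###" marker."""
    return [inst for inst in instances if not inst.is_ignored]
```

Whether "###" instances should be excluded from training or evaluation depends on the protocol being followed; the filter above is only one common choice.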
3. Dataset Format
We provide two label formats:
(1) A txt file is provided for each image. Each line represents one text instance and contains the corner coordinates, the text content and the ID, separated by '\t':
"x1,y1,x2,y2,x3,y3,x4,y4\ttext\tID"
(2) A json file is provided for the training set and the test set respectively. The format of the annotation file is as follows:
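For the txt labels in format (1), a line can be read by splitting on the tab character, as in the following Python sketch. The function name `parse_txt_line` and the returned dictionary keys are assumptions, as is the treatment of coordinates as numbers that may be non-integer. The sample line is invented and is not taken from the dataset.

```python
def parse_txt_line(line):
    """Parse one label line: 'x1,y1,x2,y2,x3,y3,x4,y4<TAB>text<TAB>ID'."""
    line = line.rstrip("\r\n")
    # Split off the first and last fields so that the text itself may
    # contain spaces or commas without breaking the parse.
    coords_field, rest = line.split("\t", 1)
    transcript, track_id = rest.rsplit("\t", 1)

    values = [float(v) for v in coords_field.split(",")]
    if len(values) != 8:
        raise ValueError(f"expected 8 coordinates, got {len(values)}")

    # Pair the flat list into four (x, y) corner points.
    quad = list(zip(values[0::2], values[1::2]))
    return {"quad": quad, "transcript": transcript, "track_id": track_id}


# Invented example line, for illustration only.
sample = "10,20,110,20,110,50,10,50\tSPEED LIMIT 60\t7\n"
print(parse_txt_line(sample))
```

Splitting on tabs rather than on whitespace matters because spaces are among the annotated symbols and can appear inside the text content.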
4. Condition of Use
Contact
Fei Yin (fyin@nlpr.ia.ac.cn)
National Laboratory of Pattern Recognition (NLPR)
Institute of Automation of Chinese Academy of Sciences
95 Zhongguancun East Road, Beijing 100190, P.R. China
Haidian | Beijing | China
Phone: (+86-10) 8254-4797
Fax: (+86-10) 8254-4594
Email: liucl@nlpr.ia.ac.cn
Website: www.nlpr.ia.ac.cn/pal/